Dynamic Regret Minimization for Control of Non-stationary Linear Dynamical Systems
Abstract
We consider the problem of controlling a Linear Quadratic Regulator (LQR) system over a finite horizon $T$ with fixed and known cost matrices $Q, R$, but unknown and non-stationary dynamics $\{A_t, B_t\}$. The sequence of dynamics matrices can be arbitrary, but with a total variation, $V_T$, assumed to be $o(T)$ and unknown to the controller. Under the assumption that a sequence of stabilizing, but potentially sub-optimal, controllers is available for all $t$, we present an algorithm that achieves the optimal dynamic regret of $\tilde{\mathcal{O}}\left(V_T^{2/5}T^{3/5}\right)$. With piece-wise constant dynamics, our algorithm achieves the optimal regret of $\tilde{\mathcal{O}}(\sqrt{ST})$, where $S$ is the number of switches. The crux of our algorithm is an adaptive non-stationarity detection strategy, which builds on an approach recently developed for contextual Multi-armed Bandit problems. We also argue that non-adaptive forgetting (e.g., restarting or using sliding window learning with a static window size) may not be regret optimal for the LQR problem, even when the window size is optimally tuned with the knowledge of $V_T$. The main technical challenge in the analysis of our algorithm is to prove that the ordinary least squares (OLS) estimator has a small bias when the parameter to be estimated is non-stationary. Our analysis also highlights that the key motif driving the regret is that the LQR problem is in spirit a bandit problem with linear feedback and locally quadratic cost. This motif is more universal than the LQR problem itself, and therefore we believe our results should find wider application.
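The abstract's remark about OLS bias under non-stationarity can be illustrated with a toy example. The sketch below is not the paper's algorithm: it assumes a scalar system $x_{t+1} = a_t x_t + b_t u_t + w_t$ driven by random exploratory inputs, and the helper names `simulate` and `sliding_window_ols` are hypothetical. It fits $(a, b)$ by ordinary least squares over a static sliding window, so that a window straddling a switch in the dynamics can be compared with one that lies entirely after the switch.

```python
import random


def simulate(a_seq, b_seq, noise_std=0.0, seed=0):
    """Roll out x[t+1] = a[t]*x[t] + b[t]*u[t] + w[t] with Gaussian inputs.

    Illustrative assumption: inputs are pure exploration noise, not a
    stabilizing controller as in the paper's setting.
    """
    rng = random.Random(seed)
    xs, us = [0.0], []
    for a, b in zip(a_seq, b_seq):
        u = rng.gauss(0.0, 1.0)        # exploratory input
        w = rng.gauss(0.0, noise_std)  # process noise
        us.append(u)
        xs.append(a * xs[-1] + b * u + w)
    return xs, us


def sliding_window_ols(xs, us, window):
    """OLS estimate of scalar (a, b) from the last `window` transitions."""
    n = len(us)
    start = max(0, n - window)
    sxx = sxu = suu = sxy = suy = 0.0
    for t in range(start, n):
        x, u, y = xs[t], us[t], xs[t + 1]
        sxx += x * x
        sxu += x * u
        suu += u * u
        sxy += x * y
        suy += u * y
    # Solve the 2x2 normal equations [[sxx, sxu], [sxu, suu]] @ [a, b] = [sxy, suy].
    det = sxx * suu - sxu * sxu
    if abs(det) < 1e-12:
        raise ValueError("regressors are not sufficiently exciting")
    a_hat = (suu * sxy - sxu * suy) / det
    b_hat = (sxx * suy - sxu * sxy) / det
    return a_hat, b_hat


# Piece-wise constant dynamics: a switches from 0.5 to -0.3 at step 100.
a_seq = [0.5] * 100 + [-0.3] * 100
b_seq = [1.0] * 200
xs, us = simulate(a_seq, b_seq)

a_short, _ = sliding_window_ols(xs, us, window=50)   # entirely post-switch
a_long, _ = sliding_window_ols(xs, us, window=150)   # straddles the switch
print(f"window=50:  a_hat = {a_short:.3f}")
print(f"window=150: a_hat = {a_long:.3f}")
```

In this noise-free toy run, the short window recovers the post-switch value of $a$, while the long window mixes data from both regimes and returns a biased estimate. A fixed window size has to trade this bias against variance without knowing when the switch happened, which is the kind of tension the abstract attributes to non-adaptive forgetting.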
Similar resources
A New Type-II Fuzzy Logic Based Controller for Non-linear Dynamical Systems with Application to 3-PSP Parallel Robot
Type-II fuzzy logic has shown its superiority over traditional fuzzy logic when dealing with uncertainty. Type-II fuzzy logic controllers are, however, newer and more promising approaches that have recently been applied to various fields due to their significant contribution, especially when noise (an important instance of uncertainty) emerges. During the design of type-I fuz...
Control of Chaos in a Driven Non Linear Dynamical System
We present a numerical study of a one-dimensional version of the Burridge-Knopoff model [16] of N-site chain of spring-blocks with stick-slip dynamics. Our numerical analysis and computer simulations lead to a set of different results corresponding to different boundary conditions. It is shown that we can convert a chaotic behaviour system to a highly ordered and periodic behaviour by making on...
Dynamic Thresholding and Pruning for Regret Minimization
Regret minimization is widely used in determining strategies for imperfect-information games and in online learning. In large games, computing the regrets associated with a single iteration can be slow. For this reason, pruning – in which parts of the decision tree are not traversed in every iteration – has emerged as an essential method for speeding up iterations in large games. The ability to...
Regret Bounds for the Adaptive Control of Linear Quadratic Systems
We study the average cost Linear Quadratic (LQ) control problem with unknown model parameters, also known as the adaptive control problem in the control community. We design an algorithm and prove that apart from logarithmic factors its regret up to time $T$ is $O(\sqrt{T})$. Unlike previous approaches that use a forced-exploration scheme, we construct a high-probability confidence set around the mode...
Journal
Journal title: Proceedings of the ACM on Measurement and Analysis of Computing Systems
Year: 2022
ISSN: 2476-1249
DOI: https://doi.org/10.1145/3508029